Á¤º¸Ã³¸®ÇÐȸ ³í¹®Áö ¼ÒÇÁÆ®¿þ¾î ¹× µ¥ÀÌÅÍ °øÇÐ
Current Result Document :
ÇѱÛÁ¦¸ñ(Korean Title) |
Ŭ·¯½ºÅ͸µ ±â¹Ý ¾Ó»óºí ¸ðµ¨ ±¸¼ºÀ» ÀÌ¿ëÇÑ ÀÌ»óÄ¡ ŽÁö |
¿µ¹®Á¦¸ñ(English Title) |
Outlier Detection By Clustering-Based Ensemble Model Construction |
ÀúÀÚ(Author) |
¹ÚÁ¤Èñ
±èÅ°ø
±èÁöÀÏ
ÃÖ¼¼¸ñ
ÀÌ°æÈÆ
Cheong Hee Park
Taegong Kim
Jiil Kim
Semok Choi
Gyeong-Hoon Lee
|
¿ø¹®¼ö·Ïó(Citation) |
VOL 07 NO. 11 PP. 0435 ~ 0442 (2018. 11) |
Çѱ۳»¿ë (Korean Abstract) |
ÀÌ»óÄ¡ ŽÁö´Â Á¤»ó µ¥ÀÌÅÍ ºÐÆ÷¸¦ Å©°Ô ¹þ¾î³ª´Â µ¥ÀÌÅÍ »ùÇÃÀ» ŽÁöÇÏ´Â °ÍÀ» ÀǹÌÇÑ´Ù. ´ëºÎºÐÀÇ ÀÌ»óÄ¡ ŽÁö ¹æ¹ýÀº µ¥ÀÌÅÍ »ùÇÃÀÌ Á¤»ó »óŸ¦ ¹þ¾î³ª´Â Á¤µµ¸¦ ³ªÅ¸³»´Â ÀÌ»óÄ¡ Áö¼ö(outlier score)¸¦ °è»êÇÏ¿© ÁÖ¾îÁø ÀÓ°è°ª ÀÌ»óÀÏ ¶§ ÀÌ»óÄ¡·Î ÆÇÁ¤ÇÑ´Ù. ±×·¯³ª, µ¥ÀÌÅ͸¶´Ù ÀÌ»óÄ¡ Áö¼öÀÇ ¹üÀ§°¡ ´Ù¾çÇÏ°í Á¤»ó µ¥ÀÌÅÍ¿¡ ºñÇØ ÀÌ»óÄ¡ µ¥ÀÌÅÍ´Â ÀûÀº ºñÀ²·Î Á¸ÀçÇϱ⠶§¹®¿¡ ÀÌ»óÄ¡ Áö¼ö¿¡ ´ëÇÑ ÀÓ°è°ªÀ» °áÁ¤Çϱâ´Â ¸Å¿ì ¾î·Æ´Ù. ¶ÇÇÑ, ½ÇÁ¦ »óȲ¿¡¼´Â ÇнÀ¿¡ ÀÌ¿ëÇÒ ¼ö ÀÖ´Â ÃæºÐÇÑ ¾çÀÇ ÀÌ»óÄ¡¸¦ Æ÷ÇÔÇÏ´Â µ¥ÀÌÅÍÀÇ È¹µæÀÌ ¿ëÀÌÇÏÁö ¾Ê´Ù. º» ³í¹®¿¡¼´Â Á¤»ó µ¥ÀÌÅÍ°¡ ÁÖ¾îÁ³À» ¶§ À̸¦ ÀÌ¿ëÇÏ¿© Á¤»ó µ¥ÀÌÅÍ ¿µ¿ªÀ» ³ªÅ¸³»´Â ¸ðµ¨À» ±¸¼ºÇÏ°í »õ·Î¿î µ¥ÀÌÅÍ »ùÇÿ¡ ´ëÇØ ÀÌ»óÄ¡¿Í Á¤»óÄ¡ÀÇ ÀÌÁøºÐ·ù¸¦ ¼öÇàÇÏ´Â ¹æ¹ýÀ¸·Î ±ºÁýÈ ±â¹Ý ÀÌ»óÄ¡ ŽÁö ¹æ¹ýÀ» Á¦¾ÈÇÑ´Ù. ±×¸®°í, ÁÖ¾îÁø Á¤»ó µ¥ÀÌÅ͸¦ ûũ·Î ³ª´©°í °¢ ûũ¿¡ ´ëÇØ Å¬·¯½ºÅ͸µ ¸ðµ¨À» ±¸¼ºÇÑ ÈÄ ¸ðµ¨µé¿¡ ÀÇÇÑ ÀÌ»óÄ¡ ÆÇÁ¤ °á°ú¸¦ °áÇÕÇÏ´Â ¾Ó»óºí ¹æ¹ý°ú µ¿Àû º¯È°¡ ÀÖ´Â ½ºÆ®¸®¹Ö µ¥ÀÌÅÍ¿¡¼ÀÇ Àû¿ë ¹æ¹ýÀ¸·Î È®ÀåÇÑ´Ù. ½ÇÁ¦ µ¥ÀÌÅÍ¿Í Àΰø µ¥ÀÌÅ͸¦ ÀÌ¿ëÇÑ ½ÇÇè°á°ú´Â Á¦¾È ¹æ¹ýÀÇ ³ôÀº ¼º´ÉÀ» º¸¿©ÁØ´Ù.
|
¿µ¹®³»¿ë (English Abstract) |
Outlier detection means to detect data samples that deviate significantly from the distribution of normal data. Most outlier detection methods calculate an outlier score that indicates the extent to which a data sample is out of normal state and determine it to be an outlier when its outlier score is above a given threshold. However, since the range of an outlier score is different for each data and the outliers exist at a smaller ratio than the normal data, it is very difficult to determine the threshold value for an outlier score. Further, in an actual situation, it is not easy to acquire data including a sufficient amount of outliers available for learning. In this paper, we propose a clustering-based outlier detection method by constructing a model representing a normal data region using only normal data and performing binary classification of outliers and normal data for new data samples. Then, by dividing the given normal data into chunks, and constructing a clustering model for each chunk, we expand it to the ensemble method combining the decision by the models and apply it to the streaming data with dynamic changes. Experimental results using real data and artificial data show high performance of the proposed method.
|
Å°¿öµå(Keyword) |
½ºÆ®¸®¹Ö µ¥ÀÌÅÍ
¾Ó»óºí ¹æ¹ý
ÀÌ»óÄ¡ ŽÁö
K-Means Clustering
Streaming Data
Ensemble Method
Outlier Detection
K-Means Clustering
|
ÆÄÀÏ÷ºÎ |
PDF ´Ù¿î·Îµå
|